Abstract


This paper compares the Value-at-Risk (VaR) forecasts delivered by alternative
model specifications using the Model Confidence Set (MCS) procedure
recently developed by Hansen et al. (2011). The direct VaR estimates provided
by the Conditional Autoregressive Value-at-Risk (CAViaR) models of
Engle and Manganelli (2004) are compared to those obtained by the popular
Autoregressive Conditional Heteroskedasticity (ARCH) models of Engle (1982)
and to the recently introduced Generalised Autoregressive Score (GAS) models
of Creal et al. (2013) and Harvey (2013). Hansen’s procedure consists of a
sequence of tests which allows the construction of a set of “superior” models, for
which the null hypothesis of Equal Predictive Ability (EPA) is not rejected at a
given confidence level. Our empirical results suggest that, after the Global
Financial Crisis (GFC) of 2007-2008, highly nonlinear volatility models deliver
better VaR forecasts for the European countries as opposed to other regions.
The R package MCS is introduced for performing the model comparisons, and its
main features are discussed throughout the paper.


Keywords: Hypothesis testing, Model Confidence Set, Value-at-Risk, VaR
combination, ARCH, GAS, CAViaR models.


1. Introduction 


During the last decades, hundreds of models have been developed, estimated and
validated from both an empirical and a theoretical perspective. As a result, several
alternative model specifications are usually available to econometricians to
address the same empirical problem. Just to confine our considerations within
a given family, and without claiming to be complete, the Autoregressive
Conditional Heteroskedastic (ARCH) models of Engle (1982) and Bollerslev (1986), for
example, have seen an exponentially increasing number of different specifications
in the last few decades. Despite their popularity, they do not exhaust the set of
models introduced for dynamic conditional volatility modelling, which also
includes the stochastic volatility models initially proposed by Taylor (1994) and
extensively studied by Harvey and Shephard (1996) and Gallant et al. (1997)
within the context of nonlinear state space models. The family of dynamic
conditional volatility models has been recently enlarged by the GAS model of
Harvey (2013) and Creal et al. (2013), also known as Dynamic Conditional Score
(DCS). The availability of such an enormous number of models raises the
question of providing a statistical method or procedure that delivers the “best”
model with respect to a given criterion. Furthermore, a model selection issue
arises when the usual comparison procedures do not deliver a unique
result. This could happen, for example, when models are compared in terms
of their predictive ability, so that models that produce better forecasts are
preferred. Unfortunately, when evaluating the performance of different forecasting
models it is not always trivial to establish which one clearly outperforms the
remaining available alternatives. This problem is particularly relevant from
an empirical perspective, especially when the set of competing alternatives is
large. As observed by Hansen and Lunde (2005) and Hansen et al. (2011), it is
unrealistic to expect that a single model dominates all the competitors, either
because the different specifications are statistically equivalent or because there
is not enough information coming from the data to univocally discriminate among
the models.

Recently, several alternative testing procedures have been developed to
deliver the “best fitting” model, see e.g. the Reality Check (RC) of White (2000),
the Stepwise Multiple Testing procedure of Romano and Wolf (2005), the
Superior Predictive Ability (SPA) test of Hansen and Lunde (2005) and the
Conditional Predictive Ability (CPA) test of Giacomini and White (2006). Among
those multiple-testing procedures, the Model Confidence Set (MCS) procedure
of Hansen et al. (2003), Hansen (2005) and Hansen et al. (2011) consists of a
sequence of statistical tests which allows the construction of the “Superior Set of
Models” (SSM), for which the null hypothesis of equal predictive ability (EPA) is not
rejected at a given confidence level $\alpha$. The EPA test statistic is evaluated for an
arbitrary loss function, so that models can be compared on different aspects
depending on the chosen loss function. The possibility of specifying
user-supplied loss functions therefore enhances the flexibility of the procedure.
The MCS procedure starts from an
initial set of $m$ competing models, denoted by $M^0$, and results in a (hopefully)
smaller set of superior models, the SSM, denoted by $\hat{M}^*_{1-\alpha}$. Of course, the best
scenario is when the final set consists of a single model. At each iteration, the
MCS procedure tests the null hypothesis of EPA of the competing models and
ends with the creation of the SSM only if the null hypothesis is not rejected;
otherwise the test is repeated on a smaller set of models
obtained by eliminating the worst one at the previous step.

This paper compares the Value-at-Risk (VaR) forecasts delivered by alternative
model specifications recently introduced in the financial econometric
literature, using the MCS procedure, similarly to Caporin and McAleer (2014) and
Chen and Gerlach (2013). More specifically, the direct quantile estimates
obtained by the dynamic CAViaR models of Engle and Manganelli (2004) are
compared with the VaR forecasts delivered by several ARCH-type models of Engle
(1982) and Bollerslev (1986) and with those obtained by two different
specifications of the GAS models of Creal et al. (2013) and Harvey (2013). The CAViaR
model of Engle and Manganelli (2004) has been shown, see e.g. Chen et al.
(2012), to provide reliable quantile-based VaR estimates in several empirical
experiments. During the last few decades, the ARCH-type models of Bollerslev
(1986) have become a standard approach to model the conditional volatility
dynamics. Those approaches are compared to the new class of score driven models,
which are promising in modelling highly nonlinear volatility dynamics, see e.g.,
Creal et al. (2013) and Harvey (2013). To the best of our knowledge, VaR forecasting
comparisons using the MCS procedure have not yet been considered for nonlinear
GAS dynamic models. Our empirical results suggest that, after the Global
Financial Crisis (GFC) of 2007-2008, highly nonlinear volatility models deliver
better VaR forecasts for the European countries. On the contrary, for the North
America and Asia Pacific regions we find quite homogeneous results with respect
to the models’ complexity. As discussed in Kowalski and Shachmurove (2014),
this empirical finding is consistent with the greater impact that the GFC had
on the European financial markets as compared to the non-Euro areas.

The models belonging to the superior set delivered by Hansen’s procedure
can then be used for different purposes. For example, they can be used
to forecast future volatility levels, to predict future observations conditional
on the past information, or to deliver future Value-at-Risk estimates,
as argued by Bernardi et al. (2014). Alternatively, the models can be
combined to obtain better forecast measures. Since the original work of
Bates and Granger (1969), many papers have argued that combining
predictions from alternative models often improves upon forecasts based on a single
“best” model. In an environment where observations are subject to structural
breaks and models are subject to different levels of misspecification, a strategy
that pools information coming from different models typically performs better
than methods that try to select the best forecasting model. Our analysis also
considers the Dynamic Model Averaging technique proposed by Bernardi et al.
(2014) in order to aggregate the VaR forecasts delivered by the SSM, conditional
on the models’ past out-of-sample performances as in Samuels and Sekkel (2011)
and Samuels and Sekkel (2013). For further information about the application
of the model averaging methodology the reader is referred to Bernardi et al.
(2014). Our results confirm that, under an optimal combination of models,
individual VaR forecasts can be substantially improved with respect to standard
backtest measures.

Another relevant contribution of the paper concerns the development of the
R (R Development Core Team, 2013) package MCS. The MCS package provides
an integrated environment for the comparison of alternative models, or of
alternative specifications within the same family, using the MCS procedure of
Hansen et al. (2011), and it is available on the CRAN repository,
http://cran.r-project.org/web/packages/MCS/index.htm
The MCS package is very flexible since both the model types and the loss
functions can be supplied by the user. This freedom allows
the user to concentrate on substantive issues, such as the construction of the
initial set of model specifications, $M^0$, without being limited by the constraints
imposed by the software.

The layout of the paper is as follows. In Section 2 we present the MCS
procedure of Hansen et al. (2011), highlighting the alternative specifications of
the test statistics. Section 3 details the alternative model specifications used to
compare VaR forecasts. Section 4 presents the main features of the MCS package.
Section 5 covers the empirical application, which also aims at illustrating the
implementation of the procedure using the provided package. Section 6 concludes
the paper.


2. The MCS procedure 

The availability of several alternative model specifications that are able to
adequately describe the unobserved data generating process (DGP) raises the
question of selecting the “best fitting model” according to a given optimality
criterion. This paper implements the MCS procedure recently developed by
Hansen et al. (2011) with the aim of comparing the VaR forecasts obtained by
several alternative model specifications. Hansen’s procedure consists of a
sequence of statistical tests which allows the construction of a set of “superior”
models, the “Superior Set of Models” (SSM), for which the null hypothesis of equal
predictive ability (EPA) is not rejected at a given confidence level $\alpha$. The EPA
test statistic is calculated for an arbitrary loss function that satisfies general weak
stationarity conditions, meaning that we could test models on various aspects, such
as, for example, point forecasts, as in Hansen and Lunde (2005), or in-sample
goodness of fit, as in Hansen et al. (2011). Formally, let $y_t$ denote the
observation at time $t$ and let $\hat{y}_{i,t}$ be the output of model $i$ at time $t$; the loss
function $\ell_{i,t}$ associated with the $i$-th model is defined as

$$\ell_{i,t} = \ell\left(y_t, \hat{y}_{i,t}\right), \tag{1}$$

and measures the difference between the output $\hat{y}_{i,t}$ and the “a posteriori”
realisation $y_t$. As an example of loss function, Bernardi et al. (2014) consider
the asymmetric VaR loss function of Gonzalez-Rivera et al. (2004) to compare
the ability of different GARCH specifications to predict extreme losses in high
frequency return data. The asymmetric VaR loss function of Gonzalez-Rivera et al.
(2004) is defined as

$$\ell\left(y_t, \mathrm{VaR}_t^{\tau}\right) = \left(\tau - d_t^{\tau}\right)\left(y_t - \mathrm{VaR}_t^{\tau}\right), \tag{2}$$

where $\mathrm{VaR}_t^{\tau}$ denotes the $\tau$-level predicted VaR at time $t$, given the information
up to time $t-1$, $\mathcal{F}_{t-1}$, and $d_t^{\tau} = \mathbb{1}\left(y_t < \mathrm{VaR}_t^{\tau}\right)$ is the indicator of a
violation of the $\tau$-level VaR.
The asymmetric VaR loss function represents the natural candidate to backtest
quantile-based risk measures since it penalises more heavily observations below
the $\tau$-th quantile level, i.e. $y_t < \mathrm{VaR}_t^{\tau}$. Details about the loss function
specifications can be found in Hansen and Lunde (2005) and in the following
sections.
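
As an illustration, the asymmetric VaR loss of equation (2) can be evaluated in a few lines. The sketch below is in Python with NumPy rather than R, and the function name `asymmetric_var_loss` together with the return and forecast values are hypothetical, chosen only to show the asymmetry of the penalty.

```python
import numpy as np

def asymmetric_var_loss(y, var_forecast, tau):
    """Loss of equation (2): (tau - d_t) * (y_t - VaR_t), element by element."""
    y = np.asarray(y, dtype=float)
    var_forecast = np.asarray(var_forecast, dtype=float)
    # d_t equals 1 when the realised return falls below the predicted VaR
    hit = (y < var_forecast).astype(float)
    return (tau - hit) * (y - var_forecast)

# Hypothetical daily returns and constant 5% VaR forecasts (illustration only)
returns = np.array([0.010, -0.030, 0.002])
var_5pct = np.array([-0.020, -0.020, -0.020])
losses = asymmetric_var_loss(returns, var_5pct, tau=0.05)
```

In this toy example the second observation violates the VaR, so its distance from the forecast is weighted by $1 - \tau$ instead of $\tau$ and dominates the other two losses.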

We now briefly describe how the MCS procedure is implemented. The
procedure starts from an initial set of models $M^0$ of dimension $m$, encompassing all
the alternative model specifications, and delivers, for a given confidence level $\alpha$,
a smaller set, the superior set of models (SSM), $\hat{M}^*_{1-\alpha}$, of dimension $m^* \le m$.
The SSM, $\hat{M}^*_{1-\alpha}$, contains all the models having superior predictive ability
according to the selected loss function. Of course, the best scenario is when the
final set consists of a single model, i.e. $m^* = 1$. Formally, let $d_{ij,t}$ denote the
loss differential between models $i$ and $j$:

$$d_{ij,t} = \ell_{i,t} - \ell_{j,t}, \qquad i,j = 1,\dots,m, \tag{3}$$

and let

$$d_{i\cdot,t} = (m-1)^{-1} \sum_{j \in M} d_{ij,t}, \qquad i = 1,\dots,m, \tag{4}$$

be the average loss of model $i$ relative to any other model $j$ at time $t$. The EPA
hypothesis for a given set of models $M$ can be formulated in two alternative
ways:

$$\begin{aligned}
H_{0,M} &: c_{ij} = 0, && \text{for all } i,j = 1,2,\dots,m \\
H_{A,M} &: c_{ij} \neq 0, && \text{for some } i,j = 1,\dots,m,
\end{aligned} \tag{5}$$

$$\begin{aligned}
H_{0,M} &: c_{i\cdot} = 0, && \text{for all } i = 1,2,\dots,m \\
H_{A,M} &: c_{i\cdot} \neq 0, && \text{for some } i = 1,\dots,m,
\end{aligned} \tag{6}$$


where $c_{ij} = E\left(d_{ij}\right)$ and $c_{i\cdot} = E\left(d_{i\cdot}\right)$ are assumed to be finite and time
independent. According to Hansen et al. (2011), the two hypotheses defined in
equations (5) and (6) can be tested by constructing the following two statistics

$$t_{ij} = \frac{\bar{d}_{ij}}{\sqrt{\widehat{\mathrm{var}}\left(\bar{d}_{ij}\right)}}, \tag{7}$$

$$t_{i\cdot} = \frac{\bar{d}_{i\cdot}}{\sqrt{\widehat{\mathrm{var}}\left(\bar{d}_{i\cdot}\right)}}, \tag{8}$$

for $i,j \in M$, where $\bar{d}_{i\cdot} = (m-1)^{-1} \sum_{j \in M} \bar{d}_{ij}$ is the average loss of the $i$-th
model relative to the average losses across the models belonging to the set $M$, and
$\bar{d}_{ij} = n^{-1} \sum_{t=1}^{n} d_{ij,t}$ measures the relative average loss between models $i$ and $j$
over the $n$ observations of the evaluation sample.
The quantities $\widehat{\mathrm{var}}\left(\bar{d}_{i\cdot}\right)$ and $\widehat{\mathrm{var}}\left(\bar{d}_{ij}\right)$ are bootstrapped estimates of
$\mathrm{var}\left(\bar{d}_{i\cdot}\right)$ and $\mathrm{var}\left(\bar{d}_{ij}\right)$, respectively. They are
calculated by performing a block-bootstrap procedure where the block length $p$
is set as the maximum number of significant parameters obtained by fitting an
AR($p$) process on the $d_{ij}$ terms. In the MCS package the user can either specify
an arbitrary block length $p$ or use the default AR($p$) procedure. Details about
the implemented bootstrap procedure can be found in White (2000), Kilian
(1999), Clark and McCracken (2001), Hansen et al. (2003), Hansen and Lunde
(2005), Hansen et al. (2011) and Bernardi et al. (2014). The statistic $t_{ij}$ is used
in the well known test for comparing two forecasts; see e.g., Diebold and Mariano
(2002) and West (1996), while the second one is used in Hansen et al. (2003),
Hansen (2005) and Hansen et al. (2011). As discussed in Hansen et al. (2011), the
two EPA null hypotheses presented in equations (5) and (6) map naturally into the
two test statistics

$$T_{R,M} = \max_{i,j \in M} \left| t_{ij} \right| \qquad \text{and} \qquad T_{\max,M} = \max_{i \in M} t_{i\cdot}, \tag{9}$$

where $t_{ij}$ and $t_{i\cdot}$ are defined in equations (7) and (8). Since the asymptotic
distributions of the two test statistics are nonstandard, the relevant distributions
under the null hypothesis are estimated using a bootstrap procedure similar to that
used to estimate $\mathrm{var}\left(\bar{d}_{i\cdot}\right)$ and $\mathrm{var}\left(\bar{d}_{ij}\right)$. For further details about the
bootstrap procedure, see e.g., White (2000), Hansen et al. (2003), Hansen (2005),
Kilian (1999) and Clark and McCracken (2001).
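
To make the construction of the statistics in equations (7) to (9) concrete, the following Python sketch computes them from an $n \times m$ matrix of losses. It is only an illustration under stated assumptions: the function names are hypothetical, a plain moving-block bootstrap with a fixed block length replaces the AR($p$) block-length rule described above, and it is not the code of the MCS package.

```python
import numpy as np

def block_bootstrap_indices(n, block_len, rng):
    """Row indices for one moving-block bootstrap resample of length n."""
    n_blocks = int(np.ceil(n / block_len))
    starts = rng.integers(0, n - block_len + 1, size=n_blocks)
    return (starts[:, None] + np.arange(block_len)).ravel()[:n]

def epa_statistics(losses, block_len=5, n_boot=500, seed=0):
    """Return (T_R, T_max) for an (n x m) matrix of losses (assumed layout)."""
    losses = np.asarray(losses, dtype=float)
    rng = np.random.default_rng(seed)
    n, m = losses.shape
    mean_loss = losses.mean(axis=0)
    d_ij = mean_loss[:, None] - mean_loss[None, :]   # average loss differentials
    d_i = d_ij.sum(axis=1) / (m - 1)                 # model i against the others
    boot_ij = np.empty((n_boot, m, m))
    for b in range(n_boot):
        mb = losses[block_bootstrap_indices(n, block_len, rng)].mean(axis=0)
        boot_ij[b] = mb[:, None] - mb[None, :]
    boot_i = boot_ij.sum(axis=2) / (m - 1)
    # Bootstrap variances: mean squared deviation from the sample values
    var_ij = ((boot_ij - d_ij) ** 2).mean(axis=0)
    var_i = ((boot_i - d_i) ** 2).mean(axis=0)
    off_diag = ~np.eye(m, dtype=bool)                # t_ii is left at zero
    t_ij = np.zeros((m, m))
    t_ij[off_diag] = d_ij[off_diag] / np.sqrt(var_ij[off_diag])
    t_i = d_i / np.sqrt(var_i)
    return np.abs(t_ij).max(), t_i.max()

# Simulated losses: the third model is worse by construction (illustration only)
rng = np.random.default_rng(1)
example_losses = rng.chisquare(1, size=(300, 3))
example_losses[:, 2] += 1.0
t_r, t_max = epa_statistics(example_losses)
```

Both statistics are large in this simulated example because one model has a clearly higher average loss; with statistically equivalent models they would be of the order of a few units at most.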


As said in the Introduction, the MCS procedure consists of a sequential
testing procedure, which eliminates at each step the worst model, until the
hypothesis of equal predictive ability (EPA) is not rejected for all the models
belonging to the SSM. At each step, the worst model to be eliminated
is chosen using an elimination rule that is coherent with the test statistics
defined in equations (7) and (8), namely

$$e_{R,M} = \arg\max_{i \in M} \left\{ \sup_{j \in M} \frac{\bar{d}_{ij}}{\sqrt{\widehat{\mathrm{var}}\left(\bar{d}_{ij}\right)}} \right\}, \qquad
e_{\max,M} = \arg\max_{i \in M} \frac{\bar{d}_{i\cdot}}{\sqrt{\widehat{\mathrm{var}}\left(\bar{d}_{i\cdot}\right)}}, \tag{10}$$

respectively. Summarising, the MCS procedure to obtain the SSM consists of
the following steps:

1. set $M = M^0$;

2. test the EPA hypothesis: if EPA is not rejected, terminate the algorithm and
set $\hat{M}^*_{1-\alpha} = M$; otherwise use the elimination rules defined in equation
(10) to determine the worst model;

3. remove the worst model, and go to step 2.
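
The three steps above can be sketched as a short loop. The Python code below is a simplified illustration, not the implementation of the MCS package: it uses only the $T_{\max}$ statistic, a moving-block bootstrap with a fixed block length, and hypothetical names (`model_confidence_set`, `alpha` as the significance level at which EPA is tested).

```python
import numpy as np

def model_confidence_set(losses, alpha=0.15, block_len=5, n_boot=1000, seed=0):
    """Column indices of the models surviving the sequential EPA tests."""
    losses = np.asarray(losses, dtype=float)
    rng = np.random.default_rng(seed)
    n, m0 = losses.shape
    # Draw all bootstrap resamples once so that every step reuses them
    n_blocks = int(np.ceil(n / block_len))
    starts = rng.integers(0, n - block_len + 1, size=(n_boot, n_blocks))
    idx = (starts[:, :, None] + np.arange(block_len)).reshape(n_boot, -1)[:, :n]
    boot_means = losses[idx].mean(axis=1)            # shape (n_boot, m0)
    sample_means = losses.mean(axis=0)
    keep = list(range(m0))                           # step 1: M = M^0
    while len(keep) > 1:
        m = len(keep)
        scale = m / (m - 1)                          # matches the (m-1)^{-1} average
        d_i = scale * (sample_means[keep] - sample_means[keep].mean())
        d_i_boot = scale * (boot_means[:, keep]
                            - boot_means[:, keep].mean(axis=1, keepdims=True))
        se = np.sqrt(((d_i_boot - d_i) ** 2).mean(axis=0))
        t_i = d_i / se
        # Bootstrap distribution of T_max, recentred so that EPA holds
        t_max_boot = ((d_i_boot - d_i) / se).max(axis=1)
        p_value = (t_max_boot >= t_i.max()).mean()
        if p_value >= alpha:                         # step 2: EPA not rejected
            break
        keep.pop(int(np.argmax(t_i)))                # step 3: drop the worst model
    return keep

# Simulated losses: models 2 and 3 are worse by construction (illustration only)
rng = np.random.default_rng(2)
sim_losses = rng.chisquare(1, size=(300, 4))
sim_losses[:, 2:] += 1.0
superior_set = model_confidence_set(sim_losses)
```

Reusing the same bootstrap resamples at every step keeps the sequence of tests coherent, which is one common way of organising the computation; other designs are possible.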

Since Hansen’s procedure usually delivers an SSM, $\hat{M}^*_{1-\alpha}$, which contains a
large number of models, in the next sections we also describe how to implement
a procedure that combines the VaR forecasts.



3. Model specifications 

In our empirical illustration we apply the MCS procedure detailed in Section 2
to compare the VaR forecasts obtained by fitting a list of popular models
introduced in the econometric literature over the last few decades. The autoregressive
conditional heteroskedastic models, introduced by Engle (1982) and Bollerslev
(1986), are probably among the most widely employed tools in quantile-based
quantitative risk management. Here, we consider the ARCH-type models not
only because of their popularity, but also because of their ability to account
for the main stylised facts about financial returns. Moreover, since the seminal
paper of Engle (1982), hundreds of different specifications have been proposed,
some of them having a huge flexibility in handling series having different
characteristics. Despite their vast popularity, the ARCH models are principally
focused on the scale of the modelled conditional distributions. The GAS models,
introduced by Creal et al. (2013) and Harvey (2013), have been recently gaining
popularity also because they nest some of the traditional dynamic conditional
variance approaches, enlarging the class of time-varying parameter models. For
the purposes of this paper, the ARCH family of models is discussed in Section
3.1 while GAS models are considered in Section 3.2. ARCH-type and GAS
models provide an indirect estimation of the quantile-based risk measures
because of the parametric assumption on the conditional distribution. A way to
overcome the need to specify the conditional distribution is to model directly
the quantile as in the Conditional Autoregressive Value-at-Risk (CAViaR)
approach of Engle and Manganelli (2004). The CAViaR specifications are briefly
discussed in Section 3.3.

Let $y_t$ be the logarithmic return at time $t$. Throughout the paper, we
consider the following general formulation encompassing all the considered model
specifications

$$y_t \mid \left(\mathcal{F}_{t-1}, \zeta_t, \vartheta\right) \sim \mathcal{D}\left(y_t; \zeta_t, \vartheta\right), \qquad t = 1,2,\dots,T, \tag{11}$$

$$\zeta_t = h\left(\zeta_{t-1}, \dots, \zeta_{t-p}, y_{t-1}, \dots, y_{t-p} \mid \mathcal{F}_{t-1}\right), \tag{12}$$

where $\mathcal{F}_t$ is the information set up to time $t$, $\zeta_t$ is a vector of time-varying
parameters, $\vartheta$ is a vector of static parameters, and $\zeta_{t-1}, \dots, \zeta_{t-p}$ and $y_{t-1}, \dots, y_{t-p}$
are lagged values of the dynamic parameters and of the observations, up to order
$p \ge 1$, respectively. Finally, the function $h(\cdot)$ refers to one of the dynamics
reported below, while $\mathcal{D}(\cdot)$ is a specified density function. Moreover, all the
model parameters are estimated by maximising the log-likelihood function, see
e.g. Francq and Zakoian (2011).


3.1. ARCH models 

ARCH-type models are flexible and powerful tools for conditional volatility
modelling, because they are able to account for the volatility clustering phenomenon
as well as other established stylised facts. Models belonging to this class have
been principally proposed in order to describe the time-varying nature of the
conditional volatility that characterises financial assets. They have also become
one of the most used tools for researchers and practitioners dealing with financial
market exposure. The simplest conditional volatility dynamics we consider in
this paper is the GARCH(p,q) specification introduced by Bollerslev (1986)

$$\sigma_t^2 = \omega + \sum_{i=1}^{p} \alpha_i \varepsilon_{t-i}^2 + \sum_{j=1}^{q} \beta_j \sigma_{t-j}^2, \tag{13}$$

where $\varepsilon_{t-i} = y_{t-i} - \mu$, for $i = 1,2,\dots,p$ and $t = 1,2,\dots,T$, $\mu \in \mathbb{R}$ is the
mean of $y_t$, $\omega > 0$, $0 \le \alpha_i < 1$, $\forall i = 1,2,\dots,p$ and $0 \le \beta_j < 1$, $\forall j = 1,2,\dots,q$,
with $P = \sum_{i=1}^{p} \alpha_i + \sum_{j=1}^{q} \beta_j < 1$ to preserve weak ergodic stationarity of the
conditional variance. Sometimes it is possible to observe high persistence in
financial time series volatility, i.e. it is possible to observe series for which
$P \approx 1$. To account for this scenario the IGARCH(p,q) specification, where the
persistence parameter is imposed to be exactly 1, i.e. $P = 1$, has been
proposed by Engle and Bollerslev (1986). Despite their popularity, the GARCH
and the IGARCH specifications are not able to account for returns exhibiting
higher volatility after negative shocks than after positive ones, as theorised
by the “leverage effect” of Black (1976). Consequently, in the financial
econometric literature several alternative specifications have been proposed. The
EGARCH(p,q) model of Nelson (1991), for example, assumes that the
conditional volatility dynamics follows

$$\log\left(\sigma_t^2\right) = \omega + \sum_{i=1}^{p} \left[\alpha_i \varpi_{t-i} + \gamma_i \left(\left|\varpi_{t-i}\right| - E\left|\varpi_{t-i}\right|\right)\right] + \sum_{j=1}^{q} \beta_j \log\left(\sigma_{t-j}^2\right), \tag{14}$$

where $\varpi_{t-i} = \varepsilon_{t-i}/\sigma_{t-i}$, for $i = 0,1,\dots,p$ and $t = 1,2,\dots,T$. The asymmetric
response is introduced through the $\gamma_i$ parameters: for $\gamma_i < 0$ negative shocks
will have a larger impact on future volatility than positive shocks of
the same magnitude. Note that no positivity constraints are imposed on the
parameters $\alpha_i, \beta_j, \gamma_i$. For the EGARCH(p,q) specification the persistence
parameter $P$ is equal to $P = \sum_{j=1}^{q} \beta_j$. One of the most flexible models belonging
to the ARCH family is the Asymmetric-Power-ARCH(p,q) (APARCH,
henceforth) model of Ding et al. (1993), which imposes the following dynamics on the
conditional variance

$$\sigma_t^{\delta} = \omega + \sum_{i=1}^{p} \alpha_i \left(\left|\varepsilon_{t-i}\right| - \gamma_i \varepsilon_{t-i}\right)^{\delta} + \sum_{j=1}^{q} \beta_j \sigma_{t-j}^{\delta}, \tag{15}$$

where the $\delta$ parameter plays the role of a Box-Cox transformation (see Box and Cox,
1964). To ensure the positiveness of the conditional variance the following
parameter restrictions are imposed: $\omega > 0$, $\delta \ge 0$, $-1 < \gamma_i < 1$ for $i = 1,\dots,p$ and
the usual conditions $\alpha_i \ge 0$ and $\beta_j \ge 0$, for $i,j = 1,2,\dots,\max\{p,q\}$. In the
APARCH specification the persistence strongly depends upon the distributional
assumption made on $y_t$, i.e.

$$P = \sum_{i=1}^{p} \alpha_i \kappa_i + \sum_{j=1}^{q} \beta_j, \tag{16}$$

where $\kappa_i = E\left(\left|\varpi_t\right| - \gamma_i \varpi_t\right)^{\delta}$, for $i = 1,\dots,p$. The APARCH specification results
in a very flexible model that nests several of the most popular univariate ARCH
parameterisations, such as


(i) the GARCH(p,q) of Bollerslev (1986), for $\delta = 2$ and $\gamma_i = 0$, for $i =
1,2,\dots,p$;

(ii) the Absolute-Value-GARCH (AVARCH, henceforth) specification, for $\delta =
1$ and $\gamma_i = 0$ for $i = 1,2,\dots,p$, proposed by Taylor (1986) and Schwert
(1990) to mitigate the influence of large, in an absolute sense, shocks with
respect to the traditional GARCH specification;

(iii) the GJR-GARCH model (GJRGARCH, henceforth) of Glosten et al. (1993),
for $\delta = 2$ and $0 \le \gamma_i < 1$ for $i = 1,2,\dots,p$;

(iv) the Threshold GARCH (TGARCH, henceforth) of Zakoian (1994), for
$\delta = 1$, which allows different reactions of the volatility to different signs
of the lagged errors;

(v) the Nonlinear GARCH (NGARCH, henceforth) of Higgins and Bera (1992),
for $\gamma_i = 0$ for $i = 1,2,\dots,p$ and $\beta_j = 0$ for $j = 1,2,\dots,q$.
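
The nesting in case (i) can be checked numerically. The Python sketch below implements the APARCH(1,1) and GARCH(1,1) recursions of equations (15) and (13) for a given series of residuals; the function names and all parameter values are hypothetical and serve only as an illustration.

```python
import numpy as np

def aparch_variance(eps, omega, alpha, gamma, beta, delta, sigma2_0):
    """Conditional variances implied by the APARCH(1,1) recursion (15)."""
    eps = np.asarray(eps, dtype=float)
    s_delta = np.empty(len(eps) + 1)          # stores sigma_t ** delta
    s_delta[0] = sigma2_0 ** (delta / 2.0)
    for t in range(len(eps)):
        shock = (abs(eps[t]) - gamma * eps[t]) ** delta
        s_delta[t + 1] = omega + alpha * shock + beta * s_delta[t]
    return s_delta ** (2.0 / delta)           # back to the variance scale

def garch_variance(eps, omega, alpha, beta, sigma2_0):
    """Conditional variances implied by the GARCH(1,1) recursion (13)."""
    eps = np.asarray(eps, dtype=float)
    sigma2 = np.empty(len(eps) + 1)
    sigma2[0] = sigma2_0
    for t in range(len(eps)):
        sigma2[t + 1] = omega + alpha * eps[t] ** 2 + beta * sigma2[t]
    return sigma2

# Hypothetical residuals and parameters (illustration only)
eps = np.array([0.010, -0.020, 0.015])
aparch = aparch_variance(eps, 1e-5, 0.10, 0.0, 0.85, 2.0, 1e-4)
garch = garch_variance(eps, 1e-5, 0.10, 0.85, 1e-4)

# Leverage effect: with gamma > 0 a negative shock raises the variance more
var_after_neg = aparch_variance([-0.02], 1e-5, 0.10, 0.5, 0.85, 2.0, 1e-4)[-1]
var_after_pos = aparch_variance([0.02], 1e-5, 0.10, 0.5, 0.85, 2.0, 1e-4)[-1]
```

With $\delta = 2$ and $\gamma = 0$ the two recursions coincide, while a positive $\gamma$ makes the response to a negative residual larger than the response to a positive one of the same size.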

Another interesting specification is the Component-GARCH(p,q) of Engle and Lee
(1993) (CGARCH, henceforth), which decomposes the conditional variance into
a permanent and a transitory component in a straightforward way

$$\begin{aligned}
\sigma_t^2 &= \xi_t + \sum_{i=1}^{p} \alpha_i \left(\varepsilon_{t-i}^2 - \xi_{t-i}\right) + \sum_{j=1}^{q} \beta_j \left(\sigma_{t-j}^2 - \xi_{t-j}\right) \\
\xi_t &= \omega + \rho\, \xi_{t-1} + \eta \left(\varepsilon_{t-1}^2 - \sigma_{t-1}^2\right),
\end{aligned} \tag{17}$$

where, in order to ensure the stationarity of the process, we impose $\sum_{i=1}^{p} \alpha_i +
\sum_{j=1}^{q} \beta_j < 1$ and the additional condition that $\rho < 1$. Further parameter
restrictions for the positiveness of the conditional variance are given in
Engle and Lee (1993). This solution is usually employed because it makes it possible
to investigate the long and short-run movements of volatility. The considered
conditional volatility models are a minimal part of the huge number of
specifications available in the financial econometric literature. We chose these models
because of their heterogeneity, since each of them focuses on a different kind of
stylised fact. Moreover, even if they could seem very similar, the way in which
they account for the stylised fact changes. For a very extensive and up to date
survey on GARCH models we refer the reader to the works of Bollerslev
(2008), Teräsvirta (2009), Bauwens et al. (2006), Silvennoinen and Teräsvirta
(2009) and the recent book of Francq and Zakoian (2011).


3.2. GAS models 

The GAS framework recently introduced by Creal et al. (2013) and Harvey
(2013) is gaining considerable attention from econometricians in many fields of time
series analysis. Under the Cox et al. (1981) classification the GAS models can
be considered as a class of observation driven models, with the usual
consequences of having a closed form for the likelihood and ease of evaluation. The
key feature of GAS models is that the predictive score of the conditional density
is used as forcing variable in the updating equation of a time-varying
parameter. Two main reasons for adopting this updating procedure have been given in
the literature. Harvey (2013), for example, argues that the GAS specification can be
seen as an approximation to a filter for a model driven by a stochastic latent
parameter that is, by definition, “unobservable”. Creal et al. (2013) instead
consider the conditional score as a steepest ascent direction for improving the
model’s local fit given the current parameter position, as usually happens in
a Newton-Raphson algorithm. Moreover, the flexibility of the GAS framework
means that this class nests a large number of well-known econometric
models such as, for example, some of the ARCH-type models of Engle
(1982) and Bollerslev (1986) for volatility modelling, and also the MEM, ACD
and ACI models of Engle (2002), Engle and Gallo (2006), Engle and Russell
(1998) and Russell (1999), respectively. Finally, one of the practical
implications of using this framework in order to update the time-varying parameters
is that it avoids the problem of using an inadequate forcing variable when
the choice of it is not so obvious. In fact, it may be argued that the use of
the conditional score vector allows the dynamic parameter to be updated
considering all the information coming from the entire distribution, as usually
happens in a state space framework. On the contrary, many other observation
driven models, such as ARCH-type models, only use the expected value of the
conditional distribution. GAS model applications range over several interesting
areas, such as risk measures and dependence modelling, Bernardi and Catania
(2015), Lucas and Zhang (2014) and Salvatierra and Patton (2014), volatility
modelling, Harvey and Sucarrat (2014), and nonlinear autoregressive
settings, Koopman et al. (2015) and Delle Monache and Petrella (2014).

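Formally, let us consider the general model specification in Equations (11)
and (12), where, as before, $\mathcal{D}(\cdot)$ denotes a probability density, $\zeta_t$ is a set of
time-varying parameters and $\vartheta$ is a vector of time-independent parameters. For
example, $\mathcal{D}(\cdot)$ may be a Student-t distribution with fixed degrees of freedom
($\vartheta = \nu$) and time-varying volatility ($\zeta_t = \sigma_t$). Then, the updating equation for
the time-varying parameters according to the GAS framework is

$$\begin{aligned}
\zeta_{t+1} &= \omega + A s_t + B \zeta_t \\
s_t &= S_t\left(\zeta_t \mid \vartheta\right) \nabla_t\left(\zeta_t \mid \vartheta\right),
\end{aligned}$$

where $\omega$ is a vector of constants, $A$ and $B$ are coefficient matrices, and
$\nabla_t\left(\zeta_t \mid \vartheta\right)$ is the conditional score of the pdf $\mathcal{D}(\cdot)$, evaluated at $\zeta_t$

$$\nabla_t\left(\zeta_t \mid \vartheta\right) = \frac{\partial \ln \mathcal{D}\left(y_t; \zeta_t, \vartheta\right)}{\partial \zeta_t},$$

and $S_t\left(\zeta_t \mid \vartheta\right)$ is a positive definite, possibly parameter-dependent scaling
matrix. A convenient choice for the scaling matrix $S_t\left(\zeta_t \mid \vartheta\right)$ is usually given by
the Fisher information matrix

$$S_t\left(\zeta_t \mid \vartheta\right) = \left[\mathcal{I}\left(\zeta_t \mid \vartheta\right)\right]^{-\alpha}, \tag{18}$$

where $\mathcal{I}\left(\zeta_t \mid \vartheta\right)$ can be written as:

$$\mathcal{I}\left(\zeta_t \mid \vartheta\right) = -E_{t-1}\left[\frac{\partial^2 \ln \mathcal{D}\left(y_t; \zeta_t, \vartheta\right)}{\partial \zeta_t\, \partial \zeta_t'}\right] = E_{t-1}\left[\nabla_t\left(\zeta_t \mid \vartheta\right) \nabla_t\left(\zeta_t \mid \vartheta\right)'\right], \tag{19}$$

and $\alpha$ is usually set equal to one of $\{0, 1/2, 1\}$. Note that for $\alpha = 0$ the scaling matrix
$S_t\left(\zeta_t \mid \vartheta\right)$ coincides with the identity matrix. Creal et al. (2013) suggest using
the inverse Fisher information matrix ($\alpha = 1$) or its (pseudo)-inverse square
root ($\alpha = 1/2$) in order to scale the conditional score by a quantity that accounts
for its curvature (variance). In our empirical tests we find this choice much more
efficient than using an identity scaling matrix. However, sometimes the Fisher
information matrix is not available in closed form, and simulation and numerical
evaluation techniques should be used to approximate the Hessian matrix.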

Another interesting property of the GAS framework is that it adapts quite
naturally to various reparameterisations of the problem at hand. This aspect
is quite useful when the natural parameter space is constrained to a subset
of the real line $\mathbb{R}$, and therefore a mapping function $m(\cdot)$ between this and
the real line becomes necessary. For example, let us define $\Xi \subseteq \mathbb{R}$ as the
natural parameter space and $m : \mathbb{R} \to \Xi$ an absolutely continuous deterministic
invertible mapping function that maps the real line into the natural parameter
space $\Xi$. In general we consider the modified logistic function defined by

$$m_{(L,U)}(x) = L + \frac{U - L}{1 + e^{-x}}, \tag{20}$$

which maps $\mathbb{R}$ into the interval $(L,U)$. Moreover, let us define $\tilde{\zeta}_t = m^{-1}\left(\zeta_t\right)$ as
the unmapped version of the parameter $\zeta_t$; then the GAS model suited for the
new time-varying parameter $\tilde{\zeta}_t$ with $\alpha = 1$ is defined as

$$\tilde{\zeta}_{t+1} = \omega + A \tilde{s}_t + B \tilde{\zeta}_t, \tag{21}$$

where $\tilde{s}_t = \dot{m}_t s_t$ and $\dot{m}_t = \partial m^{-1}\left(\zeta_t\right) / \partial \zeta_t'$ is the Jacobian of the inverse
mapping. For other possible choices of $\alpha$ we refer to
Creal et al. (2013).
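
A small Python sketch of the mapping may help. It assumes the standard form of the modified logistic function, $m_{(L,U)}(x) = L + (U - L)/(1 + e^{-x})$, for equation (20); the function names and the bounds used in the example are hypothetical.

```python
import math

def logistic_map(x, lower, upper):
    """Map an unconstrained real number x into the interval (lower, upper)."""
    return lower + (upper - lower) / (1.0 + math.exp(-x))

def logistic_unmap(z, lower, upper):
    """Inverse mapping: recover the unconstrained value from z in (lower, upper)."""
    return math.log((z - lower) / (upper - z))

# Hypothetical bounds, e.g. degrees of freedom constrained to (2, 50)
mapped = logistic_map(0.0, 2.0, 50.0)
round_trip = logistic_unmap(logistic_map(1.3, 2.0, 50.0), 2.0, 50.0)
```

Working with the unmapped parameter lets the recursion run on the whole real line, while the mapped value always respects the bounds.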


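In the empirical application, we will consider the GAS specification for the
parameters of the Gaussian and Student-t distributions. Formally, the Gaussian
GAS model for the parameters $\zeta_t^{N} = \left(\mu_t, \sigma_t^2\right)'$ is given by:

$$y_t \mid \zeta_t^{N} \sim N\left(\mu_t, \sigma_t^2\right), \qquad t = 1,2,\dots,T,$$

with

$$\begin{pmatrix} \mu_{t+1} \\ \tilde{\sigma}^2_{t+1} \end{pmatrix} =
\begin{pmatrix} \omega_{\mu} \\ \omega_{\sigma^2} \end{pmatrix} +
\begin{pmatrix} \alpha_{\mu} & 0 \\ 0 & \alpha_{\sigma^2} \end{pmatrix} \tilde{s}_t +
\begin{pmatrix} \beta_{\mu} & 0 \\ 0 & \beta_{\sigma^2} \end{pmatrix}
\begin{pmatrix} \mu_t \\ \tilde{\sigma}^2_t \end{pmatrix},$$

where $\tilde{\sigma}^2_t = \log\left(\sigma_t^2\right)$ and, borrowing the previous notation,

$$\dot{m}_t = \begin{pmatrix} 1 & 0 \\ 0 & 1/\sigma_t^2 \end{pmatrix}, \qquad
\mathcal{I}_t = \begin{pmatrix} \dfrac{1}{\sigma_t^2} & 0 \\ 0 & \dfrac{1}{2\sigma_t^4} \end{pmatrix}, \qquad
\nabla_t = \begin{pmatrix} \dfrac{y_t - \mu_t}{\sigma_t^2} \\[2ex] -\dfrac{1}{2\sigma_t^2} + \dfrac{\left(y_t - \mu_t\right)^2}{2\sigma_t^4} \end{pmatrix}.$$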
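
The Gaussian GAS recursion can be sketched in a few lines of Python. The code below assumes inverse Fisher information scaling ($\alpha = 1$), the log-variance reparameterisation described above and diagonal coefficient matrices; the function name and the parameter values in the example are hypothetical.

```python
import numpy as np

def gaussian_gas_filter(y, omega, a, b, mu0, sigma2_0):
    """Filtered means and variances of a Gaussian GAS model (alpha = 1)."""
    omega, a, b = (np.asarray(v, dtype=float) for v in (omega, a, b))
    mu = np.empty(len(y) + 1)
    log_s2 = np.empty(len(y) + 1)
    mu[0], log_s2[0] = mu0, np.log(sigma2_0)
    for t, y_t in enumerate(y):
        sigma2 = np.exp(log_s2[t])
        err = y_t - mu[t]
        # Score of the Gaussian log-density with respect to (mu, sigma^2)
        score = np.array([err / sigma2,
                          -0.5 / sigma2 + 0.5 * err ** 2 / sigma2 ** 2])
        inv_info = np.diag([sigma2, 2.0 * sigma2 ** 2])  # inverse Fisher information
        jacobian = np.diag([1.0, 1.0 / sigma2])          # d(mu, log sigma^2)/d(mu, sigma^2)
        s_tilde = jacobian @ inv_info @ score
        state = np.array([mu[t], log_s2[t]])
        # Diagonal A and B, so the update is applied element by element
        mu[t + 1], log_s2[t + 1] = omega + a * s_tilde + b * state
    return mu, np.exp(log_s2)

# One hypothetical observation and parameter set (illustration only)
mu_path, sigma2_path = gaussian_gas_filter(
    [0.5], omega=[0.0, 0.0], a=[0.1, 0.2], b=[1.0, 1.0], mu0=0.0, sigma2_0=1.0)
```

Under these assumptions the scaled score reduces to the prediction error for the mean and to the standardised squared error minus one for the log-variance, which makes the role of the score as a forcing variable easy to read.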


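The Student-t GAS model with time-varying location, scale and shape
parameters $\zeta_t^{T} = \left(\mu_t, \phi_t, \nu_t\right)'$ is given by:

$$y_t \mid \left(\mathcal{F}_{t-1}, \zeta_t^{T}, \vartheta\right) \sim \mathcal{T}\left(\mu_t, \phi_t, \nu_t\right), \qquad t = 1,2,\dots,T,$$

with

$$\begin{pmatrix} \mu_{t+1} \\ \tilde{\phi}_{t+1} \\ \tilde{\nu}_{t+1} \end{pmatrix} =
\begin{pmatrix} \omega_{\mu} \\ \omega_{\phi} \\ \omega_{\nu} \end{pmatrix} +
\begin{pmatrix} \alpha_{\mu} & 0 & 0 \\ 0 & \alpha_{\phi} & 0 \\ 0 & 0 & \alpha_{\nu} \end{pmatrix} \tilde{s}_t +
\begin{pmatrix} \beta_{\mu} & 0 & 0 \\ 0 & \beta_{\phi} & 0 \\ 0 & 0 & \beta_{\nu} \end{pmatrix}
\begin{pmatrix} \mu_t \\ \tilde{\phi}_t \\ \tilde{\nu}_t \end{pmatrix},$$

where $\tilde{\phi}_t = \log\left(\phi_t\right)$, $\tilde{\nu}_t = \log\left(\nu_t\right)$, and

$$\dot{m}_t = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1/\phi_t & 0 \\ 0 & 0 & 1/\nu_t \end{pmatrix}, \qquad
\mathcal{I}_t = \begin{pmatrix}
\dfrac{\nu_t + 1}{\phi_t^2\left(\nu_t + 3\right)} & 0 & 0 \\[2ex]
0 & \dfrac{2\nu_t}{\phi_t^2\left(\nu_t + 3\right)} & -\dfrac{2}{\phi_t\left(\nu_t + 3\right)\left(\nu_t + 1\right)} \\[2ex]
0 & -\dfrac{2}{\phi_t\left(\nu_t + 3\right)\left(\nu_t + 1\right)} & h_1\left(\nu_t\right)
\end{pmatrix},$$

$$\nabla_t = \begin{pmatrix}
\dfrac{\left(\nu_t + 1\right)\left(y_t - \mu_t\right)}{\nu_t \phi_t^2 + \left(y_t - \mu_t\right)^2} \\[2ex]
-\dfrac{1}{\phi_t} + \dfrac{\left(\nu_t + 1\right)\left(y_t - \mu_t\right)^2}{\nu_t \phi_t^3 + \phi_t \left(y_t - \mu_t\right)^2} \\[2ex]
h_2\left(y_t, \mu_t, \phi_t, \nu_t\right)
\end{pmatrix},$$

and

$$h_1\left(\nu_t\right) = \frac{1}{4}\psi'\left(\frac{\nu_t}{2}\right) - \frac{1}{4}\psi'\left(\frac{\nu_t + 1}{2}\right) - \frac{\nu_t + 5}{2\nu_t\left(\nu_t + 3\right)\left(\nu_t + 1\right)},$$

$$h_2\left(y_t, \mu_t, \phi_t, \nu_t\right) = \frac{1}{2}\psi\left(\frac{\nu_t + 1}{2}\right) - \frac{1}{2}\psi\left(\frac{\nu_t}{2}\right) - \frac{1}{2\nu_t}
- \frac{1}{2}\log\left[1 + \frac{\left(y_t - \mu_t\right)^2}{\nu_t \phi_t^2}\right]
+ \frac{\left(\nu_t + 1\right)\left(y_t - \mu_t\right)^2}{2\nu_t\left(\phi_t^2 \nu_t + \left(y_t - \mu_t\right)^2\right)},$$

where $\psi(x)$ and $\psi'(x)$ are the digamma and the trigamma functions,
respectively. Recently, an OxMetrics package providing functions to estimate various
specifications of GAS models has been developed by Andres (2014).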


3.3. Dynamic quantile models 


The CAViaR models of Engle and Manganelli (2004) extend the standard
quantile regression model introduced by Koenker and Bassett Jr (1978) and
belong to the family of dynamic quantile autoregressive models proposed by
Koenker and Xiao (2006). Formally, let $f_t(\beta_\tau) = f_t(x_{t-1}, \beta_\tau)$ denote the
$\tau$-th level conditional quantile at time $t$ of the observed variable $y_t$,
conditional on the information available at time $t - 1$, i.e. $x_{t-1}$, and on the
vector of unknown parameters $\beta_\tau$. The generic CAViaR specification for the
$\tau$-th quantile can be written as:

$$ y_t = f_t(\beta_\tau) + \epsilon_t, \qquad t = 1, 2, \ldots, T, \qquad (22) $$

$$ f_t(\beta_\tau) = \theta_0 + \sum_{i=1}^{p} \theta_i f_{t-i}(\beta_\tau) + \sum_{i=1}^{q} \phi_i \ell(x_{t-i}), \qquad (23) $$

where $\theta = (\theta_0, \theta_1, \ldots, \theta_p) \in \mathbb{R}^{p+1}$ collects the autoregressive parameters, while
$\phi = (\phi_1, \ldots, \phi_q) \in \mathbb{R}^{q}$, $\ell(\cdot)$ is the function linking the lagged exogenous
information or past returns $y_{1:t-1} = (y_1, y_2, \ldots, y_{t-1})$ to the current conditional
quantile, and the error term in the measurement Equation (22) is such that its
$\tau$-th level quantile is equal to zero, i.e. $q_\tau(\epsilon_t \mid x_{t-1}) = 0$, in order to ensure that
$f_t(\beta_\tau)$ is the $\tau$-th level conditional quantile of the observed variable $y_t$ given
the observations up to time $t - 1$. As noted by Engle and Manganelli (2004), the
smooth evolution of the quantile over time is guaranteed by the autoregressive
terms $\theta_i f_{t-i}(\beta_\tau)$, $i = 1, 2, \ldots, p$, while the function $\ell(\cdot)$ can be
interpreted as the News Impact Curve (NIC) introduced by Engle and Ng (1993)
for ARCH-type models.

The estimation procedure of the $\tau$-th regression quantile in the frequentist
approach is based on the minimisation of the following loss function

$$ \min_{\beta_\tau} \sum_{t=1}^{T} \rho_\tau\left(y_t - f_t(\beta_\tau)\right), \qquad (24) $$

with $\rho_\tau(u) = u\left(\tau - \mathbb{1}(u < 0)\right)$. Alternatively, the Asymmetric Laplace
distribution (ALD) can be used as a misspecified likelihood function to perform
maximum likelihood inference, as suggested by Yu and Moyeed (2001) from a
Bayesian perspective. They also prove that, under an improper prior for the
regression parameters $\beta_\tau$, the Bayesian Maximum a Posteriori (MaP) estimate
coincides with the solution of the minimisation problem in Equation (24). Some
examples of specifications of the CAViaR dynamic in Equation (23) have been
introduced in the seminal paper of Engle and Manganelli (2004):


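(i) symmetric absolute value:

$$ f_t(\beta) = \beta_1 + \beta_2 f_{t-1}(\beta) + \beta_3 |y_{t-1}|, \qquad (25) $$

(ii) asymmetric slope:

$$ f_t(\beta) = \beta_1 + \beta_2 f_{t-1}(\beta) + \beta_3 y_{t-1} \mathbb{1}_{[0,+\infty)}(y_{t-1}) + \beta_4 y_{t-1} \mathbb{1}_{(-\infty,0)}(y_{t-1}), \qquad (26) $$

(iii) indirect GARCH(1,1):

$$ f_t(\beta) = \left[\beta_1 + \beta_2 f_{t-1}^2(\beta) + \beta_3 y_{t-1}^2\right]^{1/2}, \qquad (27) $$

(iv) adaptive:

$$ f_t(\beta) = f_{t-1}(\beta) + \beta_1 \left\{ \left[1 + \exp\left\{G\left(y_{t-1} - f_{t-1}(\beta)\right)\right\}\right]^{-1} - \tau \right\}, \qquad (28) $$

where $G \in \mathbb{R}^{+}$ is a positive constant.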
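To make the link between a CAViaR recursion and the quantile loss concrete, the following sketch evaluates the check function and the objective minimised in estimation for the symmetric absolute value specification. It is an illustration only: the starting value `f0`, the function names and the plain-list data layout are assumptions, and no optimiser is included.

```python
def check_loss(u, tau):
    """Quantile (check) loss: u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def sav_caviar_path(y, beta, f0):
    """Symmetric absolute value recursion: f_t = b1 + b2 * f_{t-1} + b3 * |y_{t-1}|."""
    b1, b2, b3 = beta
    f = [f0]  # f0 is an assumed starting quantile, e.g. an empirical quantile
    for t in range(1, len(y)):
        f.append(b1 + b2 * f[t - 1] + b3 * abs(y[t - 1]))
    return f


def quantile_objective(y, beta, tau, f0):
    """Sum of check losses of y_t around the CAViaR quantile path."""
    f = sav_caviar_path(y, beta, f0)
    return sum(check_loss(yt - ft, tau) for yt, ft in zip(y, f))


returns = [1.0, -2.0, 3.0]
quantiles = sav_caviar_path(returns, beta=(-0.1, 0.5, -0.2), f0=-1.0)
objective = quantile_objective(returns, (-0.1, 0.5, -0.2), tau=0.05, f0=-1.0)
```

In practice the objective would be passed to a numerical optimiser over `beta`; since the check function is not differentiable at zero, derivative-free or multi-start routines are commonly preferred.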


The different specifications introduced in the original paper of Engle and
Manganelli (2004) can be estimated using the code available at the second
author's web page:

http://www.simonemanganelli.org/Simone/Research.html




4. The MCS package 


As described in Section 2, the MCS procedure is used to compare different
models under a user-defined loss function. The loss function measures the
“performance” of the competing models at each time point t = 1, 2, ... in
the evaluation period. Suppose we compare m alternative models over an
evaluation period of length n; then the loss function defined in Equation (1)
delivers a loss matrix of dimension (m x n) containing, for each time t = 1, 2, ..., n,
the losses associated with each competing model. The R function MCSprocedure()
is then used to construct the set of superior models outlined in Section 2.

4.1. Comparing models using the MCS routine

The MCS procedure can be used to compare models under various aspects. For
example, it can be used to assess the models' ability to predict future
volatility levels or future returns, conditional on current and past information.
The object “Loss” represents the main input of the implemented MCSprocedure()
function. It consists of a matrix of dimension (m x n), where m is the number
of competing models in the initial set $M^0$ and n is the number of observations
in the evaluation period. In the next section, we describe some alternative loss
function specifications which are particularly suitable for assessing volatility
forecasts as well as forecasts of future observations.


4.2. Loss functions 

As previously discussed, the MCS procedure is able to discriminate models
under a user-defined loss function. The choice of the loss function is somewhat
arbitrary, and it crucially depends on the nature of the competing models and
the scope of their usage. For more considerations about the choice of the loss
function for model comparison purposes, we refer to Hansen and Lunde (2005),
Bollerslev et al. (1994), Diebold and Lopez (1996) and Lopez (2001). In what
follows, we report the loss functions available within the MCS package. However,
since the MCSprocedure() function accepts as input a pre-defined loss matrix,
named “Loss”, the user is free to define and use their own loss function. The
following loss functions are freely available within the MCS package:


1. LossVaR(), which can be used to check the performance associated with
VaR or, more generally, quantile forecasts;

2. LossVol(), for the assessment of volatility forecasts;

3. LossLevel(), which can be used instead for level forecasts, such as the
point forecasts of the mean from a regression model.


These loss functions accept as inputs common and function-specific arguments.
The common arguments are “realized”, which consists of a vector of realised
observations, i.e. the ones that a model hopes to accurately forecast or
describe, and “evaluated”, which is a vector or a matrix of model outputs. It
is worth noting that we decided to call the second argument of those functions
“evaluated” instead of “forecasted” since the MCS procedure is more general
than a simple procedure for forecast evaluation. In fact, as reported by
Hansen et al. (2011), the MCS procedure also adapts to in-sample studies. The
third argument, “which”, is instead function specific. The available choices for
the common and function-specific arguments are reported below.


1. Concerning the LossVaR() function, the only available choice is which
= "asymmetricLoss". This coincides with the asymmetric VaR loss function
of Gonzalez-Rivera et al. (2004) defined in Equation (2), which is used to
assess quantile-based risk measures, such as the VaR, because it penalises
more heavily observations below the $\tau$-th level quantile, i.e. $y_t < VaR_t^{\tau}$.
Further arguments of the LossVaR() function are: “tau”, which represents
the VaR confidence level, and type = {"normal", "differentiable"}. The
“type” argument selects between the normal and the differentiable
versions of the loss function. The choice “normal” specifies the
loss function of Gonzalez-Rivera et al. (2004) defined in Equation (2), while
the option “differentiable” considers the following loss function

$$ \ell\left(r_t, VaR_t^{\tau}\right) = \left(\tau - m_\delta\left(r_t, VaR_t^{\tau}\right)\right)\left(r_t - VaR_t^{\tau}\right), \qquad (29) $$

where $m_\delta(a, b) = \left[1 + \exp\{\delta(a - b)\}\right]^{-1}$. Note that the parameter $\delta$,
which controls the smoothness of the function, is fixed to the default value of
25, but a different value can be specified by the user.


2. Concerning LossVol(), we implemented the functions reported in
Hansen and Lunde (2005). Note that for this kind of loss function the
realized and evaluated arguments should be a realised volatility
measure $\tilde{\sigma}_{t+1}$ and the point volatility forecast $\hat{\sigma}_{t+1}$. In this context,
we use the term volatility for the standard deviation. The implemented
loss functions are:


2.1 $SE_{1,t+1} = (\tilde{\sigma}_{t+1} - \hat{\sigma}_{t+1})^2$, by setting which = "SE1";

2.2 $SE_{2,t+1} = (\tilde{\sigma}_{t+1}^2 - \hat{\sigma}_{t+1}^2)^2$, by setting which = "SE2";

2.3 $QLIKE_{t+1} = \log(\hat{\sigma}_{t+1}^2) + \tilde{\sigma}_{t+1}^2 \hat{\sigma}_{t+1}^{-2}$, by setting which = "QLIKE";

2.4 $R^2LOG_{t+1} = \left[\log(\tilde{\sigma}_{t+1}^2 \hat{\sigma}_{t+1}^{-2})\right]^2$, by setting which = "R2LOG";

2.5 $AE_{1,t+1} = |\tilde{\sigma}_{t+1} - \hat{\sigma}_{t+1}|$, by setting which = "AE1";

2.6 $AE_{2,t+1} = |\tilde{\sigma}_{t+1}^2 - \hat{\sigma}_{t+1}^2|$, by setting which = "AE2".

3. Concerning the LossLevel() function, the available choices are
which = {"SE", "AE"}. The two options coincide with the squared error
and the absolute error, respectively.
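The loss functions listed above are simple enough to be written out directly. The sketch below is a plain Python transcription, not the package code: the function names are hypothetical, and the “normal” VaR loss is assumed to be the usual asymmetric form (tau - 1{r < VaR})(r - VaR), with the indicator replaced by the smooth function m_delta in the differentiable version.

```python
import math


def asymmetric_var_loss(r, var, tau, kind="normal", delta=25.0):
    """Asymmetric quantile loss for one return r and one VaR forecast."""
    if kind == "normal":
        hit = 1.0 if r < var else 0.0
    elif kind == "differentiable":
        # Smooth approximation of the indicator 1{r < var}; very large
        # values of delta * (r - var) would overflow math.exp.
        hit = 1.0 / (1.0 + math.exp(delta * (r - var)))
    else:
        raise ValueError("kind must be 'normal' or 'differentiable'")
    return (tau - hit) * (r - var)


def volatility_loss(realized, forecast, which="SE1"):
    """Volatility losses; both arguments are standard deviations."""
    s2, h2 = realized ** 2, forecast ** 2
    if which == "SE1":
        return (realized - forecast) ** 2
    if which == "SE2":
        return (s2 - h2) ** 2
    if which == "QLIKE":
        return math.log(h2) + s2 / h2
    if which == "R2LOG":
        return math.log(s2 / h2) ** 2
    if which == "AE1":
        return abs(realized - forecast)
    if which == "AE2":
        return abs(s2 - h2)
    raise ValueError("unknown loss: " + which)
```

A return below the VaR forecast is weighted by (1 - tau) and a return above it by tau, so for small tau the violations dominate the average loss. Applying one of these functions to every model and every time point yields the loss matrix that the MCS procedure takes as input.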


4.3. Constructing the SSM

The function MCSprocedure() is the main routine for the MCS procedure. It
returns an R “S4” object of the class “SSM”. The main inputs of the function
MCSprocedure(), which we now briefly describe, are:

- "Loss", a matrix or something coercible to that (using the as.matrix()
function), containing the loss series for each competing model;

- "alpha", a positive scalar in (0,1) indicating the confidence level of the
MCS tests;

- "B", an integer indicating the number of bootstrapped samples used to
construct the test statistic;

- "cluster", a cluster object created by calling makeCluster() from the R
parallel package. The default option for "cluster" is NULL; otherwise
the user-supplied cluster object is employed in the parallel processing
routine;

- "statistic", which defines the test statistic that is used to test the EPA.
Possible choices are “Tmax” and “TR”, which coincide with the $T_{\max,M}$ and $T_{R,M}$
statistics defined in Section 2.
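The roles of the inputs "Loss", "alpha" and "B" can be illustrated with a deliberately simplified sketch of a sequential elimination based on a Tmax-type statistic. This is not the package implementation: the function name `mcs_tmax`, the circular block bootstrap with a fixed block length, and the layout of the loss table (one row per time point, one column per model) are all assumptions made for the example.

```python
import random


def mcs_tmax(loss, alpha=0.15, n_boot=500, block=5, seed=0):
    """Return the indices of the models surviving a Tmax-type elimination."""
    rng = random.Random(seed)
    n = len(loss)
    alive = list(range(len(loss[0])))
    while len(alive) > 1:
        m = len(alive)
        # d[t][k]: loss of surviving model k minus the average surviving loss.
        d = []
        for row in loss:
            avg = sum(row[i] for i in alive) / m
            d.append([row[i] - avg for i in alive])
        d_bar = [sum(col) / n for col in zip(*d)]
        # Circular block bootstrap of the mean relative losses.
        boot = []
        for _ in range(n_boot):
            idx = []
            while len(idx) < n:
                start = rng.randrange(n)
                idx.extend((start + k) % n for k in range(block))
            idx = idx[:n]
            boot.append([sum(d[t][k] for t in idx) / n for k in range(m)])
        # Bootstrap standard deviation of each mean (floored to avoid 0/0).
        sd = []
        for k in range(m):
            var_k = sum((b[k] - d_bar[k]) ** 2 for b in boot) / n_boot
            sd.append(max(var_k, 1e-12) ** 0.5)
        t_stat = [d_bar[k] / sd[k] for k in range(m)]
        t_max = max(t_stat)
        t_boot = [max((b[k] - d_bar[k]) / sd[k] for k in range(m)) for b in boot]
        # Ties are counted, so identical loss series are never separated.
        p_value = sum(tb >= t_max for tb in t_boot) / n_boot
        if p_value >= alpha:
            break  # equal predictive ability is not rejected
        alive.pop(t_stat.index(t_max))  # drop the worst-performing model
    return alive
```

Each pass tests equal predictive ability among the surviving models and, if it is rejected, removes the model with the largest standardised relative loss. A larger value of "alpha" makes elimination easier and hence the final set smaller; a larger number of bootstrap samples makes the p-values less noisy at a higher computational cost.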


5. Application 


In this empirical study, a panel of four major worldwide stock market indexes
is considered. The four daily stock price indices are the Asia/Pacific 600
(SXP1E), the North America 600 (SXA1E), the Europe 600 (SXXP) and the
Global 1800 (SXW1E). The data are freely available and can be downloaded
from the STOXX website http://www.stoxx.com/indices/types/benchmark.html.
The data were obtained over a 23-year time period, from December 31, 1991
to July 24, 2014, comprising a total of 5874 observations. For each market,
the returns are calculated as the logarithmic difference of the daily index level
multiplied by 100

$$ y_t = \left(\log(p_t) - \log(p_{t-1})\right) \times 100, $$

where $p_t$ is the closing index value on day $t$. To examine the ability of the
models to predict extreme VaR levels, the complete dataset of daily returns is
divided into two samples: an in-sample period from January 1, 1992 to October
6, 2006, for a total of 3814 observations, and a forecast or validation period
containing the remaining 2000 observations, from October 9, 2006 to July 24, 2014.
A rolling window approach is then used to produce 1-day ahead forecasts of the
5% VaR thresholds, $VaR_t^{0.05}$, for $t = 1, 2, \ldots, 2000$, of the considered series in the
forecast sample. Table 1 reports some descriptive statistics for the in-sample
as well as the out-of-sample period. As expected, we found evidence of departure
from normality, mainly because all the series appear to be leptokurtic and
skewed. Moreover, the Jarque and Bera (1980) test statistic strongly rejects the
null hypothesis of normality for all the considered series. It is interesting to note
that the departure from normality is stronger for the out-of-sample returns.
As widely discussed by Shiller (2012), this empirical evidence can be considered
as an effect of the recent GFC of 2007-2008 that affected the overall economy.
Furthermore, the unconditional distribution of each return series in the
out-of-sample period is negatively skewed and shows higher standard deviation
as well as higher kurtosis. The 5% unconditional quantile, which represents the
VaR at $\tau$ = 5% under the iid assumption of the returns' conditional distribution,
has moved further into the left tail in the second part of the sample. The
changes in the behaviour of the considered panel of returns suggest that the
SSM would contain those models that are flexible enough to describe the impact
of the GFC on the returns' conditional distributions during the validation
period.


Index      Min     Max    Mean  Std. Dev.  Skewness  Kurtosis  5% Str. Lev.        JB

In-sample, from 02/01/1992 to 06/10/2006
SXA1E    -8.05    7.79    0.03       1.28     -0.07      5.82         -2.08   1269.17
SXP1E    -5.80    9.71    0.01       1.29      0.10      5.90         -1.99   1341.19
SXW1E    -5.54    5.02    0.03       0.99     -0.07      5.57         -1.60   1055.46
SXXP     -6.41    5.64    0.03       1.05     -0.27      6.82         -1.68   2376.35

Out-of-sample, from 09/10/2006 to 24/06/2014
SXA1E    -9.18    9.96    0.02       1.38     -0.28     11.32         -2.13   5812.50
SXP1E    -7.73    9.42    0.00       1.28     -0.33      8.72         -2.02   2768.92
SXW1E    -6.81    8.37    0.01       1.05     -0.28     10.43         -1.62   4641.35
SXXP     -7.93    9.41    0.00       1.33     -0.11      9.61         -2.06   3655.56

Table 1: Summary statistics of the panel of international indexes, for the in-sample and
out-of-sample periods. The column denoted by “5% Str. Lev.” is the 5% empirical
quantile of the returns distribution, while the column denoted by “JB” is the value of
the Jarque-Bera test statistic.
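The return definition and the unconditional 5% quantile reported in the summary statistics are straightforward to reproduce. The sketch below uses made-up prices and hypothetical function names; the quantile is computed as a simple order statistic, which is only one of several common conventions and need not coincide with the one used for Table 1.

```python
import math


def pct_log_returns(prices):
    """Percentage log returns: 100 * (log(p_t) - log(p_{t-1}))."""
    return [100.0 * (math.log(p1) - math.log(p0))
            for p0, p1 in zip(prices, prices[1:])]


def empirical_quantile(values, tau):
    """Smallest observation x such that at least a share tau of the data is <= x."""
    ordered = sorted(values)
    k = max(0, math.ceil(tau * len(ordered)) - 1)
    return ordered[k]


prices = [100.0, 101.0, 99.5, 100.2, 98.0, 98.9]  # illustrative values only
returns = pct_log_returns(prices)
var_unconditional = empirical_quantile(returns, 0.05)
```

Under the iid assumption this empirical quantile is itself a VaR estimate, which is why its shift to the left in the out-of-sample period signals a change in tail risk.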

As previously said, our empirical application focuses on the ability of several
competing models to forecast the Value-at-Risk at a given confidence level
$\tau$. To apply the MCS procedure, we forecast the VaR at $\tau$ = 5% using the
models described in Section 3, estimated on each of the four indexes. More precisely,
we consider eight different GARCH(1,1) specifications (GARCH, EGARCH,
APARCH, AVGARCH, GJRGARCH, TGARCH, NGARCH, CGARCH) with
Gaussian and Student-t innovations detailed in Section 3.1, the GAS-N and
the GAS-T models described in Section 3.2, and four CAViaR model
specifications reported in Section 3.3, comprising a total of 22 models. Estimated
coefficients for each model are not reported to save space, but they are available
upon request from the second author. For the GARCH and the GAS models, VaR
estimates are obtained by inverting the corresponding conditional cumulative
distribution function, while the CAViaR specifications deliver the quantile
estimates directly. The MCS procedure of Hansen et al. (2011) described in Section 2
is then applied to obtain the set of models with superior predictive ability in
terms of the supplied VaR forecasts.
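For a model with Gaussian innovations, inverting the conditional distribution function amounts to shifting and scaling a standard normal quantile. The following sketch shows this step for a single forecast; the function name is hypothetical, and the Student-t case, which requires the quantile function of the t distribution, is not covered.

```python
from statistics import NormalDist


def gaussian_var(mu, sigma, tau=0.05):
    """VaR at level tau for a conditional N(mu, sigma^2) distribution."""
    return mu + sigma * NormalDist().inv_cdf(tau)


# One-day-ahead forecast with conditional mean 0.03 and volatility 1.2
# (returns in percent; the numbers are illustrative only).
var_forecast = gaussian_var(0.03, 1.2, tau=0.05)
```

The CAViaR models skip this step entirely, because the recursion is written for the quantile itself rather than for the mean and variance.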


For each index, Table 2 reports the composition of the SSM. Naturally, the
higher the number of eliminated models, the higher the heterogeneity of the
competing forecasts. On the contrary, if the final SSM contains a large portion
of the starting set $M^0$, then the competing models are statistically equivalent in
terms of their ability to forecast future VaR levels. The p-values of the $T_{R,M}$
and $T_{\max,M}$ statistics are reported in the fourth and seventh columns of Table 2,
respectively. The p-value of the test statistic is equal to the minimum of the
overall p-values. For a detailed discussion about the interpretation of the MCS
p-values we refer to Hansen et al. (2011). The estimated SSMs differ in the
number of eliminated models as well as in their composition. We can observe that, for
the SXP1E index, only 2 models were eliminated by the MCS procedure. This
empirical finding highlights the statistical equivalence of forecasting future VaR
levels using a simple model such as the GARCH(1,1)-N or a more sophisticated
one like the GAS-T. Furthermore, this evidence suggests that the SXP1E
index may not be affected by some stylised facts such as the leverage effect or by
complex nonlinear conditional volatility dynamics. For the SXA1E and SXW1E
indexes the MCS procedure eliminates five and seven models, respectively. A
higher level of discrimination among models is instead evident for the SXXP
index. In fact, in that case, 15 of the 20 considered models do not belong to the
final SSM and the five remaining models are characterised by strong nonlinear
dynamics for the conditional volatility process. Moreover, 4 of the 5 remaining
models are characterised by a Student-t distribution, meaning that, for the
SXXP index, a more leptokurtic conditional distribution is required to improve
future VaR forecasts. At first glance, it may seem strange to observe that
the composition of the SSM is quite homogeneous with respect to the conditional
distribution assumption. Indeed, apart from the SXXP index, all series
appear to be well described by a Gaussian, as well as by a Student-t distribution.
However, as discussed by Jondeau et al. (2007), it should be noted that
the Gaussian assumption for the innovations does not imply Gaussianity of the
unconditional distribution of the returns. Concerning the empirical relevance
of the distribution assumption, the considered return series can be viewed as
highly diversified portfolios. Well diversified portfolios are characterised by the
fact that positive and negative tail events affecting the conditional distribution
and its kurtosis are mitigated by the diversification. The need for a
heavy-tailed distribution to describe the SXXP return index is probably due to the
higher impact that the GFC had on the European economy as compared to the
other regions. Finally, it is interesting to note that all the CAViaR
specifications are always excluded from the SSM, suggesting that the CAViaR dynamics do
not adequately describe future VaR levels.


Model         Rank_R,M    t_ij  p-value_R,M  Rank_max,M    t_i.  p-value_max,M  Loss x 10^3

SXA1E: 5 eliminations
GJRGARCH-T           1   -1.73         1.00           1   -0.45           1.00        38.90
GJRGARCH-N           2   -1.28         1.00           2    0.45           1.00        39.20
AVGARCH-N            3   -0.99         1.00           3    0.67           1.00        39.40
APARCH-T             4   -0.96         1.00           4    0.69           1.00        39.40
APARCH-N             5   -0.52         1.00           5    0.98           0.98        39.70
EGARCH-T             6   -0.20         1.00           6    1.19           0.90        39.80
TGARCH-T             7    0.01         1.00           7    1.30           0.82        39.90
NGARCH-T             8    0.26         1.00           8    1.42           0.70        40.10
GARCH-T              9    0.31         1.00           9    1.45           0.67        40.10
GAS-T               10    0.39         1.00          10    1.49           0.62        40.10
NGARCH-N            11    0.70         0.99          11    1.60           0.48        40.30
AVGARCH-T           12    0.72         0.98          13    1.68           0.38        40.30
GARCH-N             13    0.80         0.96          12    1.66           0.40        40.40
TGARCH-N            14    1.26         0.59          14    1.90           0.11        40.70
CGARCH-T            15    1.39         0.42          15    1.98           0.05        40.70

SXP1E: 2 eliminations
GAS-T                1   -1.94         1.00           1   -0.79           1.00        38.80
GJRGARCH-N           2   -1.19         1.00           2    0.78           1.00        39.50
AVGARCH-T            3   -1.05         1.00           3    0.84           1.00        39.50
GAS-N                4   -0.83         1.00           4    1.01           0.99        39.70
APARCH-N             5   -0.60         1.00           5    1.15           0.96        39.80
AVGARCH-N            6   -0.50         1.00           6    1.16           0.95        39.90
GJRGARCH-T           7   -0.50         1.00           7    1.23           0.92        39.90
APARCH-T             8   -0.13         1.00           8    1.43           0.76        40.10
EGARCH-T             9    0.26         1.00          10    1.62           0.55        40.30
EGARCH-N            10    0.32         1.00           9    1.62           0.55        40.40
TGARCH-N            11    0.39         1.00          11    1.64           0.52        40.40
GARCH-N             12    0.42         1.00          12    1.65           0.49        40.40
CGARCH-N            13    0.59         1.00          13    1.73           0.41        40.50
TGARCH-T            14    0.67         0.99          14    1.79           0.30        40.60
NGARCH-N            15    0.96         0.93          15    1.90           0.19        40.80
CGARCH-T            16    1.08         0.87          16    2.00           0.11        40.80
GARCH-T             17    1.11         0.85          17    2.00           0.10        40.90
NGARCH-T            18    1.40         0.61          18    2.14           0.01        41.10

SXW1E: 7 eliminations
GJRGARCH-T           1   -1.61         1.00           1   -0.36           1.00        28.80
APARCH-T             2   -1.21         1.00           2    0.36           1.00        29.00
EGARCH-T             3   -0.97         1.00           3    0.54           1.00        29.10
TGARCH-T             4   -0.79         1.00           4    0.67           1.00        29.20
APARCH-N             5   -0.37         1.00           5    0.93           0.98        29.30
GJRGARCH-N           6   -0.33         1.00           6    0.96           0.97        29.30
TGARCH-N             7    0.06         1.00           7    1.16           0.90        29.40
AVGARCH-T            8    0.28         1.00           8    1.31           0.76        29.50
EGARCH-N             9    0.78         0.94           9    1.56           0.42        29.70
NGARCH-T            10    0.85         0.91          10    1.62           0.33        29.70
GARCH-T             11    0.90         0.88          11    1.65           0.29        29.80
AVGARCH-N           12    1.13         0.64          12    1.76           0.15        29.90
CGARCH-T            13    1.21         0.53          13    1.82           0.09        29.90

SXXP: 15 eliminations
TGARCH-T             1    1.25         1.00           1    0.35           1.00        36.60
AVGARCH-T            2    0.67         1.00           2    0.35           1.00        36.70
EGARCH-T             3    0.12         1.00           3    0.75           0.99        36.80
APARCH-T             4    0.85         0.78           4    1.33           0.32        37.00
AVGARCH-N            5    1.16         0.19           5    1.46           0.17        37.10

Table 2: Comparison of the SSMs for the four considered international stock indexes. The
p-values of the T_R,M and T_max,M statistics are reported in the fourth and seventh columns,
respectively. The p-value of the test statistic is equal to the minimum of the overall p-values.
The columns Rank_R,M and Rank_max,M report the ranking over the models belonging to the
SSMs. Finally, the last column, Loss x 10^3, is the average loss across the considered period.


                   VaR_Dyn                      VaR_Avg
Asset       AE   ADmean    ADmax         AE   ADmean    ADmax
SXA1E     1.55    0.698    3.121       2.05    0.700    3.315
SXP1E     1.10    0.950    3.787       1.45    0.798    3.766
SXW1E     1.65    0.420    1.974       2.30    0.426    2.051
SXXP      1.95    0.514    2.520       2.50    0.515    2.836

Table 3: VaR backtesting measures of the dynamic VaR combination VaR_Dyn and the static
average VaR_Avg.

In order to test the benefits of using the MCS procedure, we also compare two
different VaR aggregation techniques. The first aggregation technique, VaR_Avg,
simply averages the VaR forecasts delivered by the 22 competing models, while
the second one, VaR_Dyn, is obtained by applying the dynamic VaR
combination method proposed in Bernardi et al. (2014) to the VaR forecasts
delivered by the models retained by the MCS procedure. The dynamic VaR
combination technique averages VaR forecasts using a dynamic updating rule based
on each model's relative contribution to the total quantile loss. Table 3 reports
the backtesting performance of the two VaR aggregation methods. Three VaR
backtesting measures are considered. The first is the Actual over Expected ratio
(AE), defined as the ratio between the realised number of VaR exceedances over a
given time horizon and its “a priori” expected value; VaR forecast series for
which the AE ratio is closer to unity are preferred. The second and the third
backtesting measures are the mean and maximum Absolute Deviation (ADmean and
ADmax) of VaR-violating returns described in McAleer and da Veiga (2008). The AD
in general provides a measure of the expected loss given a VaR violation; of
course, models with lower mean and/or maximum ADs are preferred. As shown in
Table 3, except for the SXP1E index, the VaR_Dyn series always reports lower
ADmean and ADmax than VaR_Avg, while the AE ratio is strongly improved
for all the considered indexes.
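The three backtesting measures and the static average combination can be sketched in a few lines. The code below is illustrative: the function names are hypothetical, a violation is taken to be a return strictly below the VaR forecast, and the absolute deviation is measured as the distance between a violating return and the corresponding forecast. The dynamic combination rule of Bernardi et al. (2014) is not reproduced here.

```python
def average_var(forecasts_by_model):
    """Static combination: equally weighted average of the models' VaR forecasts."""
    return [sum(vals) / len(vals) for vals in zip(*forecasts_by_model)]


def backtest(returns, var_forecasts, tau):
    """Return (AE ratio, mean AD, max AD) for a series of VaR forecasts."""
    violations = [(r, v) for r, v in zip(returns, var_forecasts) if r < v]
    # Actual number of exceedances over the number expected at level tau.
    ae = len(violations) / (tau * len(returns))
    deviations = [abs(r - v) for r, v in violations]
    ad_mean = sum(deviations) / len(deviations) if deviations else 0.0
    ad_max = max(deviations, default=0.0)
    return ae, ad_mean, ad_max


# Illustrative data: ten returns and a constant VaR forecast of -2.
rets = [-3.0, 1.0, -1.0, 0.5, -2.5, 0.0, 2.0, -0.2, 1.0, 0.3]
ae, ad_mean, ad_max = backtest(rets, [-2.0] * 10, tau=0.2)
combined = average_var([[-2.0, -3.0], [-4.0, -1.0]])
```

An AE ratio above one means that violations occur more often than the nominal level implies, so the forecasts understate risk; the AD measures then indicate how severe those violations are.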


6. Conclusion 


This paper compares alternative model specifications in terms of their VaR
forecasting performance. The model comparison is performed using the MCS
procedure recently proposed by Hansen et al. (2011). The MCS technique is
particularly useful when several different models are available and it is not
obvious which one performs better. The MCS sequence of tests delivers the
Superior Set of Models having Equal Predictive Ability in terms of a
user-supplied loss function. This flexibility helps to discriminate
models with respect to desired characteristics, such as, for example, their
forecasting performance. In our empirical application, we compare the VaR
forecasting ability of several models. More specifically, the direct quantile
estimates obtained by the dynamic CAViaR models of Engle and Manganelli (2004)
are compared with the VaR forecasts delivered by several ARCH-type models
of Engle (1982) and with those obtained by two different specifications of
the Generalised Autoregressive Score (GAS) models of Creal et al. (2013) and
Harvey (2013). The MCS procedure is first used to reduce the initial
number of models, and then to show that the dynamic VaR model averaging
technique of Bernardi et al. (2014) improves the VaR forecast
performance. Our empirical results suggest that, after the Global Financial
Crisis (GFC) of 2007-2008, highly nonlinear volatility models are preferred
by the MCS procedure for the European countries. On the contrary, quite
homogeneous results, with respect to the models' complexity, were found for
the North America and Asia Pacific regions. The paper also illustrates
the main features of the R package MCS, available on the CRAN repository,
http://cran.r-project.org/web/packages/MCS/index.html. The MCS
package is very flexible since it allows the model types and the loss
functions to be supplied by the user. This freedom allows the
user to concentrate on substantive issues, such as the construction of the initial
set of model specifications $M^0$, without being limited by software
constraints.


Acknowledgments 

This research is supported by the Italian Ministry of Research PRIN 2013- 
2015, “Multivariate Statistical Methods for Risk Assessment” (MISURA), and 
by the “Carlo Giannini Research Fellowship”, the “Centro Interuniversitario 
di Econometria” (CIdE) and “UniCredit Foundation”. In the development of
package MCS we have benefited from the suggestions and help of several users. 
In particular, we would like to thank Riccardo Sucapane for his constructive 
comments on previous drafts of this work. Our sincere thanks go to all the 
developers of R since without their continued effort and support no contributed 
package would exist. 


References 

Andres, P. (2014). Maximum likelihood estimates for positive valued dynamic 
score models; the dysco package. Computational Statistics & Data Analysis, 
76(0):34 - 42. CFEnetwork: The Annals of Computational and Financial 
Econometrics 2nd Issue. 

Bates, J. M. and Granger, C. W. (1969). The combination of forecasts. OR, 
pages 451-468. 

Bauwens, L., Laurent, S., and Rombouts, J. V. K. (2006). Multivariate GARCH
models: a survey. Journal of Applied Econometrics, 21(1):79-109.

Bernardi, M. and Catania, L. (2015). Switching-GAS Copula Models for Sys¬ 
temic Risk Assessment. Working Paper. 

Bernardi, M., Catania, L., and Petrella, L. (2014). Are News Important to 
Predict Large Losses ? Working Paper, Arxiv Preprint. 

Black, F. (1976). Studies of stock price volatility changes. In Proceedings of the 
1976 American Statistical Association, Business and Economical Statistics 
Section, Alexandria, VA: American Statistical Association, pages 177- 181. 

Bollerslev, T. (1986). Generalized autoregressive conditional heteroskedasticity. 
Journal of Econometrics, 31 (3):307 - 327. 


Bollerslev, T. (2008). Glossary to ARCH (GARCH). CREATES Research
Papers 2008-49, School of Economics and Management, University of Aarhus.

Bollerslev, T., Engle, R. F., and Nelson, D. B. (1994). ARCH models. Handbook
of Econometrics, 4:2959-3038.

Box, G. E. and Cox, D. R. (1964). An analysis of transformations. Journal of 
the Royal Statistical Society. Series B (Methodological), pages 211-252. 

Caporin, M. and McAleer, M. (2014). Robust ranking of multivariate GARCH
models by problem dimension. Computational Statistics & Data Analysis,
76:172-185. CFEnetwork: The Annals of Computational and Financial
Econometrics 2nd Issue.

Chen, Q., Gerlach, R., and Lu, Z. (2012). Bayesian value-at-risk and expected 
shortfall forecasting via the asymmetric laplace distribution. Computational 
Statistics & Data Analysis, 56(11):3498-3516. 

Chen, Q. and Gerlach, R. H. (2013). The two-sided Weibull distribution and
forecasting financial tail risk. International Journal of Forecasting,
29(4):527-540.

Clark, T. E. and McCracken, M. W. (2001). Tests of equal forecast accuracy 
and encompassing for nested models. Journal of econometrics, 105(1):85-110. 

Cox, D. R., Gudmundsson, G., Lindgren, G., Bondesson, L., Harsaae, E., Laake, 
P., Juselius, K., and Lauritzen, S. L. (1981). Statistical analysis of time series: 
Some recent developments [with discussion and reply]. Scandinavian Journal 
of Statistics, pages 93-115. 

Creal, D., Koopman, S. J., and Lucas, A. (2013). Generalized autoregressive
score models with applications. Journal of Applied Econometrics,
28(5):777-795.

Delle Monache, D. and Petrella, I. (2014). Adaptive models and heavy tails.
Technical report, Queen Mary, University of London, School of Economics
and Finance.

Diebold, F. X. and Lopez, J. A. (1996). Forecast evaluation and combination. 

Diebold, F. X. and Mariano, R. S. (2002). Comparing predictive accuracy. 
Journal of Business & economic statistics, 20(1). 

Ding, Z., Granger, C. W., and Engle, R. F. (1993). A long memory property of
stock market returns and a new model. Journal of Empirical Finance,
1(1):83-106.

Engle, R. (2002). New frontiers for ARCH models. Journal of Applied
Econometrics, 17(5):425-446.

Engle, R. F. (1982). Autoregressive conditional heteroscedasticity with
estimates of the variance of United Kingdom inflation. Econometrica,
50(4):987-1007.

Engle, R. F. and Bollerslev, T. (1986). Modelling the persistence of conditional
variances. Econometric Reviews, 5(1):1-50.

Engle, R. F. and Gallo, G. M. (2006). A multiple indicators model for volatility
using intra-daily data. Journal of Econometrics, 131(1):3-27.

Engle, R. F. and Lee, G. G. (1993). A permanent and transitory component 
model of stock return volatility. University of California at San Diego, Eco¬ 
nomics Working Paper Series. 

Engle, R. F. and Manganelli, S. (2004). CAViaR: Conditional autoregressive
value at risk by regression quantiles. Journal of Business & Economic
Statistics, 22(4):367-381.

Engle, R. F. and Ng, V. K. (1993). Measuring and testing the impact of news 
on volatility. The Journal of Finance, 48(5):1749-1778. 

Engle, R. F. and Russell, J. R. (1998). Autoregressive conditional duration: 
a new model for irregularly spaced transaction data. Econometrica, pages 
1127-1162. 

Francq, C. and Zakoian, J.-M. (2011). GARCH models: structure, statistical 
inference and financial applications. John Wiley & Sons. 

Gallant, A., Hsieh, D., and Tauchen, G. (1997). Estimation of stochastic volatil¬ 
ity models with diagnostics. Journal of Econometrics, 81(1):159 - 192. 

Giacomini, R. and White, H. (2006). Tests of conditional predictive ability. 
Econometrica, 74(6):1545-1578. 

Glosten, L. R., Jagannathan, R., and Runkle, D. E. (1993). On the relation
between the expected value and the volatility of the nominal excess return on
stocks. The Journal of Finance, 48(5):1779-1801.

Gonzalez-Rivera, G., Lee, T.-H., and Mishra, S. (2004). Forecasting volatility:
A reality check based on option pricing, utility function, value-at-risk, and
predictive likelihood. International Journal of Forecasting, 20(4):629-645.

Hansen, P. R. (2005). A test for superior predictive ability. Journal of Business 
& Economic Statistics, 23(4). 

Hansen, P. R. and Lunde, A. (2005). A forecast comparison of volatility mod¬ 
els: does anything beat a garch(l,l)? Journal of Applied Econometrics, 
20(7):873-889. 

Hansen, P. R., Lunde, A., and Nason, J. M. (2003). Choosing the best volatility 
models: The model confidence set approach. Oxford Bulletin of Economics 
and Statistics, 65(s1):839-861.


Hansen, P. R., Lunde, A., and Nason, J. M. (2011). The model confidence set. 
Econometrica, 79(2):453-497.

Harvey, A. and Sucarrat, G. (2014). EGARCH models with fat tails, skewness 
and leverage. Computational Statistics & Data Analysis, 76:320-338.
CFEnetwork: The Annals of Computational and Financial Econometrics 2nd 
Issue. 

Harvey, A. C. (2013). Dynamic Models for Volatility and Heavy Tails: With 
Applications to Financial and Economic Time Series. Cambridge University 
Press. 

Harvey, A. C. and Shephard, N. (1996). Estimation of an asymmetric stochastic 
volatility model for asset returns. Journal of Business & Economic Statistics, 
14(4):429-434. 

Higgins, M. L. and Bera, A. K. (1992). A class of nonlinear ARCH models. 
International Economic Review, pages 137-158.

Jarque, C. M. and Bera, A. K. (1980). Efficient tests for normality,
homoscedasticity and serial independence of regression residuals. Economics
Letters, 6(3):255-259.

Jondeau, E., Poon, S.-H., and Rockinger, M. (2007). Financial modeling under 
non-Gaussian distributions. Springer Science & Business Media. 

Kilian, L. (1999). Exchange rates and monetary fundamentals: What do 
we learn from long-horizon regressions? Journal of Applied Econometrics, 
14(5):491-510. 

Koenker, R. and Bassett Jr, G. (1978). Regression quantiles. Econometrica: 
Journal of the Econometric Society, pages 33-50.

Koenker, R. and Xiao, Z. (2006). Quantile autoregression. Journal of the
American Statistical Association, 101(475):980-990.

Koopman, S. J., Lucas, A., and Scharth, M. (2015). Predicting time-varying 
parameters with parameter-driven and observation-driven models. Review of 
Economics and Statistics, forthcoming. 

Kowalski, T. and Shachmurove, Y. (2014). The reaction of the U.S. and the 
European Monetary Union to recent global financial crises. Global Finance 
Journal, 25(1):27-47.

Lopez, J. A. (2001). Evaluating the predictive accuracy of volatility models. 
Journal of Forecasting, 20(2):87-109.

Lucas, A. and Zhang, X. (2014). Score driven exponentially weighted moving 
averages and value-at-risk forecasting. Tinbergen Institute Discussion Paper, 
n. 14-092.


McAleer, M. and da Veiga, B. (2008). Single-index and portfolio models for 
forecasting value-at-risk thresholds. Journal of Forecasting, 27(3):217-235. 

Nelson, D. B. (1991). Conditional Heteroskedasticity in Asset Returns: A New 
Approach. Econometrica, 59(2):347-370.

R Development Core Team (2013). R: A Language and Environment for
Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria.
ISBN 3-900051-07-0.

Romano, J. P. and Wolf, M. (2005). Stepwise multiple testing as formalized 
data snooping. Econometrica, 73(4):1237-1282. 

Russell, J. R. (1999). Econometric modeling of multivariate irregularly-spaced 
high-frequency data. Manuscript, GSB, University of Chicago. 

Salvatierra, I. D. L. and Patton, A. J. (2014). Dynamic copula models and high 
frequency data. Journal of Empirical Finance. 

Samuels, J. D. and Sekkel, R. M. (2011). Forecasting with large datasets:
Trimming predictors and forecast combination. Working paper.

Samuels, J. D. and Sekkel, R. M. (2013). Forecasting with many models: Model 
confidence sets and forecast combination. Technical report, Bank of Canada 
Working Paper. 

Schwert, G. (1990). Stock volatility and the crash of ’87. Review of Financial 
Studies, 3(1):77-102.

Shiller, R. J. (2012). The subprime solution: How today’s global financial crisis 
happened, and what to do about it. Princeton University Press. 

Silvennoinen, A. and Terasvirta, T. (2009). Multivariate GARCH models. In 
Handbook of Financial Time Series, pages 201-229. Springer. 

Taylor, S. J. (1986). Modelling Financial Time Series. Wiley.

Taylor, S. J. (1994). Modeling stochastic volatility: A review and comparative 
study. Mathematical Finance, 4(2):183-204.

Terasvirta, T. (2009). An introduction to univariate GARCH models. In Handbook 
of Financial Time Series, pages 17-42. Springer.

West, K. D. (1996). Asymptotic inference about predictive ability.
Econometrica: Journal of the Econometric Society, pages 1067-1084.

White, H. (2000). A reality check for data snooping. Econometrica,
68(5):1097-1126.

Yu, K. and Moyeed, R. (2001). Bayesian quantile regression. Statistics &
Probability Letters, 54:437-447.

Zakoian, J.-M. (1994). Threshold heteroskedastic models. Journal of Economic 
Dynamics and Control, 18(5):931-955.

